Abstract - A recursive algorithm named Zero-point Attracting
Projection (ZAP) was recently proposed for sparse signal
reconstruction. Compared with the reference algorithms, ZAP
demonstrates rather good performance in recovery precision
and robustness. However, no theoretical analysis of the
algorithm, not even a proof of its convergence, has been
available. In this work, a strict proof of the convergence of ZAP
is provided and the condition of convergence is put forward. Based
on the theoretical analysis, it is further proved that ZAP is
unbiased and can approach the sparse solution to any extent with
a proper choice of step-size. Furthermore, the case of inaccurate
measurements in the noisy scenario is also discussed. It is proved
that the disturbance power linearly reduces the recovery precision,
which is predictable but not preventable. The reconstruction
deviation for p-compressible signals is also provided. Finally,
numerical simulations are performed to verify the theoretical analysis.

Index Terms - Compressive Sensing (CS), Zero-point Attracting
Projection (ZAP), sparse signal reconstruction, $\ell_1$ norm,
convex optimization, convergence analysis, perturbation analysis,
p-compressible signal.



I. Introduction 
A. Overview of CS and Sparse Signal Recovery 

Compressive Sensing (CS) [1], [2] has been proposed as a novel
technique in the field of signal processing. Based on the
sparsity of signals in some typical domains, this method takes
global measurements instead of samples in signal acquisition.
The theory of CS confirms that the measurements required
for recovery are far fewer than those required by conventional
signal acquisition techniques.

With the advantages of sampling below the Nyquist rate and
little loss in reconstruction quality, CS can be widely applied
in areas such as source coding [3], medical imaging [4],
pattern recognition [5], and wireless communication [6].

Suppose that an $N$-dimensional vector $\mathbf{x} \in \mathbb{R}^N$ is a sparse
signal with sparsity $S$, which means that only $S$ entries of $\mathbf{x}$
are nonzero among all $N$ elements. An $M \times N$ measurement
matrix $\mathbf{A}$ with $M < N$ is applied to take global measurements
of $\mathbf{x}$. Consequently an $M \times 1$ vector

$$\mathbf{y} = \mathbf{A}\mathbf{x} \tag{1}$$

is obtained and the information of the $N$-dimensional unknown
signal is reduced to the $M$-dimensional measurement vector.
Exploiting the sparse property of $\mathbf{x}$, the original signal can be
reconstructed through $\mathbf{y}$ and $\mathbf{A}$.

The procedure of CS mainly includes two stages: signal 
measurement and signal reconstruction. The key issues are 
the design of measurement matrix and the algorithm of sparse 
signal reconstruction, respectively. 

In the signal reconstruction stage of CS, a key problem is to
derive the sparse solution, i.e., the solution to the
under-determined linear equation that has the minimal $\ell_0$ norm,

$$\min_{\mathbf{x}} \|\mathbf{x}\|_0, \quad \text{subject to } \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{$P_0$}$$



However, ($P_0$) is a Non-deterministic Polynomial (NP) hard
problem. It is demonstrated that under certain conditions [?],
($P_0$) has the same solution as the relaxed problem

$$\min_{\mathbf{x}} \|\mathbf{x}\|_1, \quad \text{subject to } \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{$P_1$}$$

($P_1$) is a convex problem and can be solved through convex
optimization.

In non-ideal scenarios, the measurement vector $\mathbf{y}$ is
inaccurate because of noise perturbation, and (1) is never satisfied
exactly. Consequently, ($P_1$) is modified to

$$\min_{\mathbf{x}} \|\mathbf{x}\|_1, \quad \text{subject to } \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2 \le \varepsilon, \tag{$P_2$}$$

where $\varepsilon$ is a positive number representing the energy of noise.

Many algorithms have been proposed to recover the sparse 
signal from y and A. These algorithms can be classified into 
several main categories, including greedy pursuit, optimization 
algorithms, iterative thresholding algorithms and other algo- 
rithms. 

The greedy pursuit algorithms iteratively choose the locally
optimal approximation to the sparse solution in
each step. The computational complexity is low but more
measurements are needed for reconstruction. Typical algorithms
include Matching Pursuit (MP) [7], Orthogonal Matching
Pursuit (OMP) [8], [9], Stage-wise OMP (StOMP) [10],
Regularized OMP (ROMP) [11], [12], Compressive Sampling MP
(CoSaMP) [13], Subspace Pursuit (SP) [14], and Iterative Hard
Thresholding (IHT) [15].

Optimization algorithms solve convex or non-convex problems
and can be further divided into convex optimization
and non-convex optimization. Convex optimization methods
require fewer measurements, have higher computational
complexity, and enjoy more theoretical support
in mathematics. Convex optimization algorithms include
the Primal-Dual interior method for Convex Objectives
(PDCO) [16], Least Square QR (LSQR) [17], Large-scale
$\ell_1$-regularized Least Squares ($\ell_1$-ls) [18], Least Angle
Regression (LARS) [19], Gradient Projection for Sparse
Reconstruction (GPSR) [20], Sparse Reconstruction by Separable
Approximation (SpaRSA) [21], Spectral Projected-Gradient $\ell_1$
(SPGL1) [22], Nesterov Algorithm (NESTA) [23], and
Constrained Split Augmented Lagrangian Shrinkage Algorithm
(C-SALSA) [24].

Non-convex optimization methods solve the problem of
optimization by minimizing the $\ell_p$ norm with $0 < p < 1$, which
is not convex. This category of algorithms demands fewer
measurements than convex optimization methods. However,
the non-convex property may lead to convergence towards a
local extremum that is not the desired solution. Moreover,
these methods have higher computational complexity. Typical
non-convex optimization methods are the FOCal Underdetermined
System Solver (FOCUSS) [25], Iteratively Reweighted
Least Squares (IRLS) [26], and $\ell_0$ Analysis-based Sparsity
(L0AbS) [27].

A new kind of method, Zero-point Attracting Projection
(ZAP), has recently been proposed to solve ($P_0$) or ($P_1$) [28].
The projection of the zero-point attraction term is utilized to
update the iterative solution in the solution space. Compared
with the other algorithms, ZAP has the advantages of a faster
convergence rate, fewer measurements demanded, and better
performance against noise.

However, ZAP was proposed with heuristic and experimental
methodology and lacks a strict proof of convergence [28].
Though abundant computer simulations verify its performance,
it is still essential to prove its convergence, provide the specific
working condition, and analyze its performance theoretically,
including the reconstruction precision, the convergence rate,
and the noise resistance.

B. Our Work 

This paper aims to provide a comprehensive analysis of
ZAP. Specifically, it studies $\ell_1$-ZAP, which uses the gradient
of the $\ell_1$ norm as the zero-point attraction term.

The main contribution of this work is to prove the
convergence of $\ell_1$-ZAP in the non-noisy scenario. Our idea is
summarized as follows. Firstly, the distance between the iterative
solution of $\ell_1$-ZAP and the original sparse signal is defined to
evaluate the convergence. Then we prove that this distance
decreases in each iteration, as long as it is larger than a
constant proportional to the step-size. Therefore, it is proved
that $\ell_1$-ZAP converges to the original sparse signal in
the non-noisy case, which provides a theoretical foundation for
the algorithm.

Another contribution concerns signal reconstruction
with measurement noise. It is demonstrated that $\ell_1$-ZAP can
approach the original sparse signal to some extent under
inaccurate measurements. In the noisy case, the recovery
precision is linear in not only the step-size but also the
energy of noise.

Other contributions include discussions of some related
topics. The convergence rate is estimated as an upper bound
on the number of iterations. The constraint on the initial value
and its influence on convergence are provided. The convergence of
$\ell_1$-ZAP for p-compressible signals is also discussed. Experimental
results are provided to verify the analysis.

The remainder of this paper is organized as follows. In
Section II, some preliminary knowledge is introduced to
prepare for the main theorems. The main contribution in the
non-noisy scenario is presented as Theorem 4 in Section III, which
proves the convergence of $\ell_1$-ZAP. Some related topics about
Theorem 4 are also discussed in Section III. Section IV presents
another main theorem for the noisy scenario, together with some
discussions. Experimental results are shown in Section
V. The whole paper is concluded in Section VI.

II. Preliminaries 
A. RIP and Coherence 

In this subsection, the Restricted Isometry Property (RIP) and
coherence are introduced and then some theorems on ($P_1$)
and ($P_2$) are presented, which will be helpful for the following
content.

Definition 1: [29] Suppose $\mathbf{A}_T$ is the $M \times |T|$ submatrix
obtained by extracting the columns of the $M \times N$ matrix $\mathbf{A}$
corresponding to the indices in the set $T \subset \{1, 2, \ldots, N\}$.
The RIP constant $\delta_S$ is defined as the smallest nonnegative
quantity such that

$$(1-\delta_S)\|\mathbf{c}\|_2^2 \le \|\mathbf{A}_T\mathbf{c}\|_2^2 \le (1+\delta_S)\|\mathbf{c}\|_2^2$$

holds for all subsets $T$ with $|T| \le S$ and vectors $\mathbf{c} \in \mathbb{R}^{|T|}$.

Theorem 1: [30] If the RIP constant of matrix $\mathbf{A}$ satisfies
the condition

$$\delta_{2S} < \sqrt{2} - 1, \tag{2}$$

where $S$ is the sparsity of $\mathbf{x}$, then the solution of ($P_1$) is unique
and identical to the original signal.

Theorem 2: [30] If the RIP constant of matrix $\mathbf{A}$ satisfies
the condition

$$\delta_{2S} < \sqrt{2} - 1, \tag{3}$$

then the solution $\mathbf{x}^*$ of ($P_2$) obeys

$$\|\mathbf{x}^* - \mathbf{x}^{o}\|_2 \le C_S\,\varepsilon, \tag{4}$$

where $\mathbf{x}^{o}$ is the original signal of sparsity $S$ and $C_S$ is a
positive constant related to $S$.

RIP determines the property of the measurement matrix.
Recent results on RIP can be found in [31], [32].

Definition 2: [33] The coherence of an $M \times N$ matrix $\mathbf{A}$
is defined as

$$\mu(\mathbf{A}) = \max_{i \ne j} |\mathbf{a}_i^T \mathbf{a}_j|,$$

where $\mathbf{a}_i$ ($1 \le i \le N$) is the $i$th column of $\mathbf{A}$ and $\|\mathbf{a}_i\|_2 = 1$.

Theorem 3: [35], [20] If the sparsity $S$ of $\mathbf{x}$ and the
coherence of matrix $\mathbf{A}$ satisfy the condition (5),
then the solution of ($P_2$) is unique.

Theorem 1 provides a sufficient condition for exact recovery
of the original signal without any perturbation. It is also
a loose sufficient condition for the uniqueness of the solution
of ($P_1$). Theorem 2 indicates that under the condition (3), the
solution of ($P_2$) is not too far from the original signal, with a
deviation proportional to the energy of measurement noise. Theorem 3
provides a sufficient condition for the uniqueness of the solution
of ($P_2$).
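As a small illustration of Definition 2, the following sketch computes the coherence of a toy matrix. The function name `coherence` and the 2 x 3 example matrix are assumptions made here for illustration; the columns are normalized inside the function because the definition presumes unit-norm columns.

```python
import math

def coherence(A):
    """Largest |a_i^T a_j| over pairs of distinct, unit-normalized columns."""
    m, n = len(A), len(A[0])
    cols = [[A[r][c] for r in range(m)] for c in range(n)]
    unit = []
    for col in cols:
        norm = math.sqrt(sum(v * v for v in col))
        unit.append([v / norm for v in col])   # Definition 2 assumes ||a_i||_2 = 1
    return max(
        abs(sum(p * q for p, q in zip(unit[i], unit[j])))
        for i in range(n) for j in range(i + 1, n)
    )

# Toy matrix (hypothetical data): two orthogonal columns plus their sum.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
mu = coherence(A)
print(round(mu, 4))
```

A smaller coherence means the columns are closer to orthogonal, which is the regime in which uniqueness conditions of the kind in Theorem 3 are easier to meet.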



B. $\ell_1$-ZAP

In the ZAP algorithm, the zero-point attraction term is used
to update the iterative solution and then the updated iterative
solution is projected onto the solution space. The procedure of
ZAP can be summarized as follows.



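Input: $\mathbf{A} \in \mathbb{R}^{M \times N}$, $\mathbf{y} \in \mathbb{R}^{M}$, $\gamma \in \mathbb{R}^{+}$.
Initialization: $n = 0$ and $\mathbf{x}_0 = \mathbf{A}^{\dagger}\mathbf{y}$.
Iteration:

while stop condition is not satisfied

1. Zero-point attraction:

$$\tilde{\mathbf{x}}_{n+1} = \mathbf{x}_n - \gamma \cdot \nabla F(\mathbf{x}_n) \tag{6}$$

2. Projection:

$$\mathbf{x}_{n+1} = \tilde{\mathbf{x}}_{n+1} + \mathbf{A}^{\dagger}(\mathbf{y} - \mathbf{A}\tilde{\mathbf{x}}_{n+1}) \tag{7}$$

3. Update the index: $n = n + 1$
end while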



In the initialization and (7), $\mathbf{A}^{\dagger} = \mathbf{A}^T(\mathbf{A}\mathbf{A}^T)^{-1}$ denotes
the pseudo-inverse of $\mathbf{A}$. In (6), $\nabla F(\mathbf{x}_n)$ is the zero-point
attraction term, where $F(\mathbf{x})$ is a function representing the
sparse penalty of vector $\mathbf{x}$. The positive parameter $\gamma$ denotes the
step-size in the step of zero-point attraction.
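The two steps (6) and (7) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the helper names, the fixed iteration count used as the stop condition, and the tiny 2 x 3 example are all assumptions, and the pseudo-inverse is applied by solving a small linear system, which presumes that $\mathbf{A}$ has full row rank.

```python
def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def solve(G, b):
    """Solve the small square system G z = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(G[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))   # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def pinv_apply(A, r):
    """Return A^T (A A^T)^{-1} r, assuming A has full row rank."""
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]
    z = solve(gram, r)
    return [sum(A[i][k] * z[i] for i in range(len(A))) for k in range(len(A[0]))]

def sign(x):
    return [(v > 0) - (v < 0) for v in x]

def zap(A, y, grad, gamma, iters):
    x = pinv_apply(A, y)                       # initialization x_0 = A^dagger y
    for _ in range(iters):                     # fixed count stands in for the stop rule
        xt = [xi - gamma * gi for xi, gi in zip(x, grad(x))]       # step (6)
        resid = [yi - ai for yi, ai in zip(y, matvec(A, xt))]
        x = [a + c for a, c in zip(xt, pinv_apply(A, resid))]      # step (7)
    return x

# Hypothetical toy problem: y is produced by the sparse vector (0, 0, 1).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]
x_hat = zap(A, y, sign, gamma=0.01, iters=500)
print([round(v, 3) for v in x_hat])
```

Passing `sign` as `grad` corresponds to using a sub-gradient of the $\ell_1$ norm as the attraction term; any other sparse-penalty gradient could be passed in the same way.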

ZAP was first proposed in [28] with a specification of the
$\ell_0$-norm constraint, termed $\ell_0$-ZAP, in which the approximate
$\ell_0$ norm is utilized as the function $F(\mathbf{x})$. $\ell_0$-ZAP belongs to
the non-convex optimization methods and has an outstanding
performance beyond conventional algorithms. In [28], the
penalty function is $\|\mathbf{x}\|_0$ and its gradient is approximated as

$$\nabla F_{\ell_0}(\mathbf{x}) \approx [f(x_1), f(x_2), \cdots, f(x_N)]^T$$

and

$$f(x) = \begin{cases} -\alpha^2 x - \alpha, & -\frac{1}{\alpha} \le x < 0; \\ -\alpha^2 x + \alpha, & 0 < x \le \frac{1}{\alpha}; \\ 0, & \text{elsewhere}. \end{cases}$$
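The piecewise function $f$ translates directly into code. The sketch below is only an illustration of the formula; the function names and the value of `alpha` are chosen here and do not come from [28].

```python
def f(x, alpha):
    """Piecewise-linear zero-point attraction term for a single entry."""
    if -1.0 / alpha <= x < 0:
        return -alpha ** 2 * x - alpha
    if 0 < x <= 1.0 / alpha:
        return -alpha ** 2 * x + alpha
    return 0.0

def grad_l0_approx(x, alpha):
    """Entry-wise application of f, approximating the gradient of the l0 penalty."""
    return [f(v, alpha) for v in x]

alpha = 2.0   # hypothetical value, used only for this example
g = grad_l0_approx([-1.0, -0.25, 0.0, 0.25, 1.0], alpha)
print(g)
```

Entries with magnitude above $1/\alpha$ receive no attraction, so only small entries are pulled towards zero.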



The piecewise and non-convex zero-point attraction term
further increases the difficulty of theoretically analyzing the
convergence of $\ell_0$-ZAP.

As another variation of ZAP, $\ell_1$-ZAP is analyzed in this
work. The function $F(\mathbf{x})$ is the $\ell_1$ norm of $\mathbf{x}$ in the zero-point
attraction term. Since it is non-differentiable, the gradient of
$F(\mathbf{x})$ can be replaced by its sub-gradient. Considering that the
gradient of $F(\mathbf{x})$ is $\mathrm{sgn}(\mathbf{x})$ when none of the components of
$\mathbf{x}$ are zero, (6) can be specified as

$$\tilde{\mathbf{x}}_{n+1} = \mathbf{x}_n - \gamma \cdot \mathrm{sgn}(\mathbf{x}_n), \tag{8}$$

where the gradient is replaced by one of the sub-gradients
$\mathrm{sgn}(\mathbf{x})$. The sign function $\mathrm{sgn}(\mathbf{x})$ has the same size as $\mathbf{x}$
and each entry of $\mathrm{sgn}(\mathbf{x})$ is the scalar sign function of the
corresponding entry of $\mathbf{x}$.

Experiments show that though its performance is better than
conventional algorithms, $\ell_1$-ZAP does not behave as well as the
$\ell_0$-norm constraint variation. However, as a convex optimization
method, $\ell_1$-ZAP has advantages beyond non-convex methods,
as mentioned in the introduction. $\ell_1$-ZAP is considered in this
paper as the first attempt to analyze ZAP in theory.

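The steps (8) and (7) of $\ell_1$-ZAP can be combined into the
following recursion

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \gamma\,\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n) \tag{9}$$

with the projection matrix

$$\mathbf{P} = \mathbf{I} - \mathbf{A}^T(\mathbf{A}\mathbf{A}^T)^{-1}\mathbf{A}. \tag{10}$$

Notice that following (9), (10) and the initialization, the
sequence has the property

$$\mathbf{A}\mathbf{x}_{n+1} = \mathbf{A}\mathbf{x}_n = \mathbf{A}\mathbf{x}_0 = \mathbf{y}, \quad \forall n \ge 0, \tag{11}$$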



which means all iterative solutions fall in the solution space. 
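Property (11) can be checked numerically. The sketch below builds the projection matrix of (10) for the special case of a single measurement row, where $(\mathbf{A}\mathbf{A}^T)^{-1}$ is a scalar; that restriction, the example row `a`, and the helper names are assumptions made to keep the example short.

```python
def projection_matrix_single_row(a):
    """P = I - a a^T / (a^T a): the form (10) takes when A consists of one row a^T."""
    n = len(a)
    aa = sum(v * v for v in a)
    return [[(1.0 if i == j else 0.0) - a[i] * a[j] / aa for j in range(n)]
            for i in range(n)]

def sign(x):
    return [(v > 0) - (v < 0) for v in x]

def l1_zap_step(x, P, gamma):
    """One application of recursion (9)."""
    s = sign(x)
    Ps = [sum(P[i][j] * s[j] for j in range(len(s))) for i in range(len(s))]
    return [xi - gamma * pi for xi, pi in zip(x, Ps)]

a = [1.0, 3.0, 2.0]      # hypothetical measurement row
y = 3.0                  # produced by the sparse vector (0, 1, 0)
P = projection_matrix_single_row(a)
x = [ai * y / sum(v * v for v in a) for ai in a]   # x_0 = A^T (A A^T)^{-1} y

residuals = []
for _ in range(200):
    x = l1_zap_step(x, P, gamma=0.01)
    residuals.append(abs(sum(ai * xi for ai, xi in zip(a, x)) - y))
print(max(residuals))
```

Because $\mathbf{A}\mathbf{P} = \mathbf{0}$, every update moves the iterate inside the solution space, so the residual stays at the level of floating-point rounding.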

Numerical simulations demonstrate that the sparse solution
of the under-determined linear equation can be calculated by
$\ell_1$-ZAP. In fact, the sequence $\{\mathbf{x}_n\}$ calculated through (9) is not
strictly convergent. $\{\mathbf{x}_n\}$ will fall into a neighborhood of $\mathbf{x}^*$
after finitely many iterations, with radius proportional to the
step-size $\gamma$. As the iterations proceed, $\mathbf{x}_n$ approaches $\mathbf{x}^*$ step by
step at first. However, it oscillates in the neighborhood of $\mathbf{x}^*$
once $\mathbf{x}_n$ is close enough to $\mathbf{x}^*$. If the step-size $\gamma$ decreases, the
radius of the neighborhood also decreases. Consequently, one can
get an approximation to the sparse solution at any precision
by choosing an appropriate step-size.
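The dependence of the final neighborhood on the step-size can be observed on a toy problem. The sketch below runs recursion (9) for two step-sizes and records the final distance to the sparse solution. The single-row measurement, the example data, and the choice of `10 / gamma` iterations are assumptions for illustration only.

```python
import math

def sign(x):
    return [(v > 0) - (v < 0) for v in x]

def run_l1_zap(a, y, gamma, iters):
    """Recursion (9) for a single-row A = a^T (a simplifying assumption)."""
    aa = sum(v * v for v in a)
    x = [ai * y / aa for ai in a]                       # least-squares start
    for _ in range(iters):
        s = sign(x)
        coef = sum(ai * si for ai, si in zip(a, s)) / aa
        # P s = s - a (a^T s) / (a^T a)
        x = [xi - gamma * (si - ai * coef) for xi, si, ai in zip(x, s, a)]
    return x

a = [1.0, 3.0, 2.0]           # hypothetical measurement row
y = 3.0
x_star = [0.0, 1.0, 0.0]      # unique l1 minimizer of a^T x = y for this row

final_dist = {}
for gamma in (0.02, 0.002):
    x = run_l1_zap(a, y, gamma, iters=int(10 / gamma))
    final_dist[gamma] = math.dist(x, x_star)
    print(gamma, final_dist[gamma])
```

In this example the final distance stays within a small multiple of $\gamma$, in line with the statement that the radius of the neighborhood is proportional to the step-size.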

In this work the convergence of $\ell_1$-ZAP is proved. The
main results are the following theorems in Sections III and
IV, corresponding to the non-noisy scenario and the noisy scenario,
respectively.

III. Convergence in Non-Noisy Scenario 

The main contribution is included in this section. A lemma
is proposed in Subsection A to prepare for the main theorem in
Subsection B. Then the condition of exact signal recovery by
$\ell_1$-ZAP is given in Subsection C. Several constants and
variables in the proof of convergence are discussed in Subsections
D and E. In Subsection F, an estimate of the convergence
rate is given. The initial value of $\ell_1$-ZAP is discussed in
Subsection G.

A. Lemma 

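Lemma 1: Suppose that $\mathbf{x} \in \mathbb{R}^N$ satisfies $\mathbf{y} = \mathbf{A}\mathbf{x}$, with
given $\mathbf{A} \in \mathbb{R}^{M \times N}$ and $\mathbf{y} \in \mathbb{R}^{M}$, and that $\mathbf{x}^*$ is the unique
solution of ($P_1$). If $\|\mathbf{x} - \mathbf{x}^*\|_2$ is bounded by a positive constant
$M_0$, then there exists a uniform positive constant $t$ depending on
$\mathbf{A}$, $\mathbf{y}$, and $M_0$, such that

$$\|\mathbf{x}\|_1 - \|\mathbf{x}^*\|_1 \ge t\,\|\mathbf{x} - \mathbf{x}^*\|_2 \tag{12}$$

holds for arbitrary $\mathbf{x}$ satisfying $\mathbf{y} = \mathbf{A}\mathbf{x}$.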

The outline of the proof is presented here while the details
are included in Appendix A.
Proof: By defining

$$g(\mathbf{x}) = \frac{\|\mathbf{x}\|_1 - \|\mathbf{x}^*\|_1}{\|\mathbf{x} - \mathbf{x}^*\|_2}, \tag{13}$$

inequality (12) is equivalent to the following inequality

$$\inf_{\mathbf{x}} g(\mathbf{x}) > 0, \quad \text{subject to } \mathbf{y} = \mathbf{A}\mathbf{x} \text{ and } 0 < \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0. \tag{14}$$

Define the index set $I = \{k \mid x^*_k \ne 0, 1 \le k \le N\}$; then
there exists a positive constant $r_0$ such that
$(\mathrm{sgn}(\mathbf{x}))_I = (\mathrm{sgn}(\mathbf{x}^*))_I$ when $\mathbf{x}$ satisfies

$$\|\mathbf{x} - \mathbf{x}^*\|_2 < r_0. \tag{15}$$

The above proposition means that $\mathbf{x}$ and $\mathbf{x}^*$ share the same
sign for the entries indexed by $I$. Consequently, for the separate
cases of $0 < \|\mathbf{x} - \mathbf{x}^*\|_2 < r_0$ and $r_0 \le \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0$,
it is proved that $g(\mathbf{x})$ has a positive lower bound, respectively.
Combining the two cases, Lemma 1 is proved. ■

B. Main Result 

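Theorem 4: Suppose that $\mathbf{x}^*$ is the unique solution of ($P_1$),
$\mathbf{x}_{n+1}$ and $\mathbf{x}_n$ satisfy the recursion (9), and $\mathbf{x}_n$ is energy
constrained by $\|\mathbf{x}_n - \mathbf{x}^*\|_2 \le M_0$, where $M_0$ is a positive
constant. Then the iteration obeys

$$\|\mathbf{x}_{n+1} - \mathbf{x}^*\|_2^2 \le \|\mathbf{x}_n - \mathbf{x}^*\|_2^2 - d\gamma^2 \tag{16}$$

when

$$\|\mathbf{x}_n - \mathbf{x}^*\|_2 \ge K\gamma, \tag{17}$$

where

$$K = \frac{\mu}{2t} \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2, \tag{18}$$

$$d = (\mu - 1) \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 \tag{19}$$

are two constants with a parameter $\mu > 1$, and $t > 0$ denotes
the lower bound specified in Lemma 1.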

For a given under-determined constraint (1) and the unique
sparsest solution of ($P_1$), Theorem 4 demonstrates the
convergence property and provides the convergence conditions of
$\ell_1$-ZAP. As long as the iterative result $\mathbf{x}_n$ is far away from
the sparse solution $\mathbf{x}^*$, the new result $\mathbf{x}_{n+1}$ in the next iteration
is guaranteed to be closer than its predecessor. Furthermore,
the decrease in the squared $\ell_2$ distance is at least the constant
$d\gamma^2$, which means $\mathbf{x}_n$ will definitely get into the
$(K\gamma)$-neighborhood of $\mathbf{x}^*$ in finitely many iterations. According
to the definition of $K$, $\mathbf{x}_n$ can approach the sparse solution
$\mathbf{x}^*$ to any extent if the step-size $\gamma$ is chosen small enough.
Therefore, $\ell_1$-ZAP is convergent, i.e., the iterative result can
get close to the sparse solution at any precision. Here $\mu$ is a
tradeoff parameter which balances the estimated precision and
convergence rate.

The proof of Theorem 4 is given in Appendix B.
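For very small $N$, the maximum of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2$ that appears in (18) and (19) can be found by enumerating all sign vectors. The sketch below does this for a single-row measurement, where $\|\mathbf{P}\mathbf{s}\|_2^2 = \|\mathbf{s}\|_2^2 - (\mathbf{a}^T\mathbf{s})^2 / (\mathbf{a}^T\mathbf{a})$. The single-row restriction, the example row, and the values of `t` and `mu` are assumptions; in particular $t$ is only known to exist, so it is supplied here as an assumed number.

```python
import itertools
import math

def p_sign_norm_sq(a, s):
    """||P s||_2^2 for single-row A = a^T, with P the orthogonal projector of (10)."""
    aa = sum(v * v for v in a)
    dot = sum(ai * si for ai, si in zip(a, s))
    return sum(si * si for si in s) - dot * dot / aa

def extremes(a):
    """Minimum (over nonzero sign vectors) and maximum of ||P s||_2^2."""
    vals = [p_sign_norm_sq(a, s)
            for s in itertools.product((-1, 0, 1), repeat=len(a)) if any(s)]
    return min(vals), max(vals)

def constants(max_sq, t, mu):
    """K and d as written in (18) and (19); t must be supplied by the caller."""
    return mu * max_sq / (2 * t), (mu - 1) * max_sq

a = [1.0, 3.0, 2.0]                          # hypothetical measurement row
min_sq, max_sq = extremes(a)
K, d = constants(max_sq, t=0.2, mu=1.5)      # t = 0.2 and mu = 1.5 are assumed values
print(min_sq, max_sq, K, d)
```

The enumeration grows as $3^N$, so this is only practical as a sanity check on toy problems.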

C. Exact Signal Recovery by $\ell_1$-ZAP

Using Theorem 4 and additional conditions, the convergence of
$\ell_1$-ZAP can be deduced, as stated in the following corollary.

Corollary 1: Under the condition (2), $\ell_1$-ZAP can recover
the original signal at any precision if the step-size $\gamma$ is
chosen small enough.

Proof: Firstly, it will be demonstrated that the condition
of energy constraint in Theorem 4 can always be satisfied. In
fact, $M_0$ can be chosen greater than $\|\mathbf{x}_0 - \mathbf{x}^*\|_2$. If the energy
constraint $\|\mathbf{x}_n - \mathbf{x}^*\|_2 \le M_0$ holds for index $n$, the conditions
of Theorem 4 are satisfied and then $\|\mathbf{x}_{n+1} - \mathbf{x}^*\|_2 \le M_0$
holds naturally according to (16). Consequently, by applying
Theorem 4 in each step, the condition of energy constraint is
satisfied for each index $n$.

Combining this with the explanation after Theorem 4, it is clear
that $\ell_1$-ZAP converges to the solution of ($P_1$) at any
precision as long as the step-size is chosen small enough.

According to Theorem 1, it is known that under the condition
(2), the solution of ($P_1$) is unique and identical to the
original sparse signal. Then Corollary 1 is proved. ■

According to Theorem 4 and Corollary 1, the sequence
will surely get into the $(K\gamma)$-neighborhood of $\mathbf{x}^*$. In fact,
because of several inequalities used in the proof, $K\gamma$ is merely
a theoretical radius with conservative estimation. The actual
sequence may get into an even smaller neighborhood. The
details will be discussed in Subsection F.



D. Constants $t$, $K$, and the Extremum of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$

Involved in (18) of Theorem 4, the constant $t$ is essential to the
convergence of $\ell_1$-ZAP. In fact, the key contribution of this
work is to indicate the existence of this constant. However, one
can merely obtain the existence of $t$ from the proof of Lemma
1, rather than its exact value. Because $\mathbf{x}^*$ in the definition
(14) is unknown, it is difficult to give the exact value or
formula of $t$, even though it is actually determined by $\mathbf{A}$,
$\mathbf{y}$, and $M_0$. Nevertheless, an upper bound that gives some
information about $t$ is available, which leads to Theorem 5.

According to (18), the constant $K$ is inversely proportional to
$t$. With a small $t$, the radius of the convergent neighborhood is
large and the convergence precision is worse. The maximum of
$\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$ is also involved in the definition of $K$. According
to the range of the sign function, i.e. $\{-1, 0, 1\}$, there are $3^N$
choices of the vector $\mathrm{sgn}(\mathbf{x})$ altogether. Similar to $t$, the extremum
of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$ is determined by $\mathbf{A}$.

The relationship between $t$ and the extremum of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$
is presented in Theorem 5.

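Theorem 5: If $t$ is defined by (14), one has the following
inequality

$$t \le \min_{\mathbf{x} \in \mathbb{R}^N,\, \mathbf{x} \ne \mathbf{0}} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2 \le \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2 \le \sqrt{N}. \tag{20}$$

The proof of Theorem 5 is postponed to Appendix C.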
According to the theorem, the minimum of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$
restricts the value of $t$, which leads to worse precision of
$\ell_1$-ZAP. Hence, the measurement matrix $\mathbf{A}$ should be chosen with
a relatively large $\min \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$ to improve the performance
of the algorithm. The mathematical meaning of
$\mathbf{P}\,\mathrm{sgn}(\mathbf{x})$ is the projection of $\mathrm{sgn}(\mathbf{x})$ onto the solution space
of $\mathbf{y} = \mathbf{A}\mathbf{x}$. As a particular instance, if there exists a sign
vector to which the solution space is almost orthogonal, then
the minimum of $\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2$ is rather small and the precision
of convergence is bad. An additional explanation is that the
solution space cannot be strictly orthogonal to any sign vector,
or else it will lead to a contradiction with the condition (2),
i.e., the uniqueness of $\mathbf{x}^*$.

E. Discussions on $\mu$ and Bound Sequence

A parameter $\mu$ is involved in Theorem 4. We will discuss
the choice of $\mu$ and some related problems. First of all, it
needs to be stressed that $\mu$ is just a parameter for the bound
sequence in theoretical analysis, rather than a parameter of the
actual iterations.

According to the proof in Appendix B, as long as $\mu$ is
chosen satisfying the conditions of $\mu > 1$ and

$$\|\mathbf{x}_n - \mathbf{x}^*\|_2 \ge \frac{\mu\gamma}{2t} \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2,$$

Theorem 4 holds and the distance between $\mathbf{x}_n$ and $\mathbf{x}^*$
decreases in the next iteration. However, considering the
expression (19), the decrease of $\|\mathbf{x}_n - \mathbf{x}^*\|_2$ in each iteration is
different for various $\mu$. There are two strategies to choose the
parameter $\mu$: a constant or a variable one.

When $\mu$ is chosen as a constant, Theorem 4 indicates that
as long as the distance between $\mathbf{x}_n$ and $\mathbf{x}^*$ is larger than $K\gamma$,
the next iteration leads to a decrease of at least a constant step
of $d\gamma^2$.

When the parameter $\mu$ is variable, the decrease step of
$\|\mathbf{x}_n - \mathbf{x}^*\|_2$ is also variable. The expressions show that $K$ and $d$
increase as $\mu$ increases. Notice that $\mu$ must obey

$$1 < \mu \le \frac{2t\,\|\mathbf{x}_n - \mathbf{x}^*\|_2}{\gamma \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2}, \tag{21}$$

where the right inequality is necessary to satisfy (17), which
ensures the convergence of the sequence. At the very
beginning of the recursion, $\mathbf{x}_n$ is far from $\mathbf{x}^*$. Consequently, $\mu$
satisfying (21) can be larger, leading to faster convergence.
However, as $\mathbf{x}_n$ gets closer to $\mathbf{x}^*$ through the iterations, $\mu$
satisfying (21) is necessarily just a little larger than one.

It should be emphasized that the actual convergence of the
iterations cannot be sped up by choosing the parameter $\mu$. The
value of $\mu$ only impacts the sequence of

$$\|\mathbf{x}'_{n+1} - \mathbf{x}^*\|_2^2 = \|\mathbf{x}'_n - \mathbf{x}^*\|_2^2 - d\gamma^2, \tag{22}$$

which is a sequence bounding the actual sequence in the proof
of convergence.

F. Convergence Rate

Theorem 4 tells little about the convergence rate. Considering
several inequalities utilized in the proof, the actual
convergence is faster than that of the sequence in (22). This means
that a lower bound of the convergence rate can be derived in
theory.

Corresponding to the variable selection of $\mu$, a sequence
$\{\mathbf{x}'_n\}$ is put forward with properties

$$\|\mathbf{x}'_{n+1} - \mathbf{x}^*\|_2^2 = \|\mathbf{x}'_n - \mathbf{x}^*\|_2^2 - \gamma^2(\mu_n - 1) \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2, \tag{23}$$

where

$$\mu_n = \frac{2t\,\|\mathbf{x}'_n - \mathbf{x}^*\|_2}{\gamma \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2} > 1. \tag{24}$$

Combining (23) and (24), the iteration of $\mathbf{x}'_n$ obeys

$$\|\mathbf{x}'_{n+1} - \mathbf{x}^*\|_2^2 = \|\mathbf{x}'_n - \mathbf{x}^*\|_2^2 - 2\gamma t\,\|\mathbf{x}'_n - \mathbf{x}^*\|_2 + \gamma^2 \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2. \tag{25}$$

The distance between $\mathbf{x}'_n$ and $\mathbf{x}^*$ with variable $\mu$ decreases
the most in each step. Therefore, $\{\mathbf{x}'_n\}$ has a faster convergence
rate compared with sequences satisfying (22) with other
choices of $\mu$. However, as a theoretical result, it still converges
more slowly than the actual sequence.

Derived from Lemma 2, which gives a rough estimate,
Theorem 6 provides a much better lower bound of the
convergence rate.

Lemma 2: Supposing $\{\mathbf{x}_n\}$ is the iterative sequence by
$\ell_1$-ZAP, it will take at most

$$\frac{[\,\cdots\,]}{2t - \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 / K_{\min}}$$

steps for $\{\mathbf{x}_n\}$ to get into the $(K_{\min}\gamma)$-neighborhood from
the $(K_{\max}\gamma)$-neighborhood of $\mathbf{x}^*$, where $K_{\max} > K_{\min}$ and
$K_{\min}$ must obey

$$K_{\min} > \frac{1}{2t} \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2. \tag{26}$$

Theorem 6: Supposing $\{\mathbf{x}_n\}$ is the iterative sequence by
$\ell_1$-ZAP, it will get into the $(K_0\gamma)$-neighborhood of $\mathbf{x}^*$ within
at most

$$[\,\cdots\,]$$

steps. Here $M_0$, $\gamma$, and $t$ have the same definitions as those
in Theorem 4, and $K_0$ must obey

$$K_0 > \frac{1}{2t} \max_{\mathbf{x} \in \mathbb{R}^N} \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2.$$

The proofs of Lemma 2 and Theorem 6 are postponed to
Appendix D and E, respectively.
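The bound sequence defined by the recursion (25) depends only on the distance $D_n = \|\mathbf{x}'_n - \mathbf{x}^*\|_2$, so it can be iterated as a scalar recursion. The sketch below counts how many steps that recursion needs to enter a $(K_0\gamma)$-neighborhood. The function name and all numerical values are assumptions for illustration; `m` stands for $\max \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2$ and `t` is an assumed value of the constant from Lemma 1.

```python
import math

def bound_sequence_steps(D0, gamma, t, m, K0):
    """Iterate (25) on D_n = ||x'_n - x*||_2 until D_n <= K0 * gamma.

    m stands for max ||P sgn(x)||_2^2. The checks mirror t <= max ||P sgn(x)||_2
    from Theorem 5 and the requirement K0 > m / (2 t).
    """
    if t * t > m or K0 <= m / (2 * t):
        raise ValueError("need t^2 <= m and K0 > m / (2 t)")
    D, steps = D0, 0
    while D > K0 * gamma:
        # D_{n+1}^2 = D_n^2 - 2 gamma t D_n + gamma^2 m, as in (25)
        D = math.sqrt(D * D - 2 * gamma * t * D + gamma * gamma * m)
        steps += 1
    return steps, D

# Hypothetical values chosen only to exercise the recursion.
steps, D_final = bound_sequence_steps(D0=0.6, gamma=0.01, t=0.2, m=3.0, K0=8.0)
print(steps, D_final)
```

Each step with $D_n > K_0\gamma$ lowers $D_n^2$ by more than $\gamma^2(2tK_0 - m)$, which is why the loop terminates whenever $K_0 > m/(2t)$.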



G. Choice of the Initial Value 

In $\ell_1$-ZAP, the initial value is the least-squares solution of
the under-determined equation,

$$\mathbf{x}_0 = \mathbf{A}^T(\mathbf{A}\mathbf{A}^T)^{-1}\mathbf{y}.$$

From Theorem 4 and Corollary 1, one knows that if the
initial value obeys $\mathbf{A}\mathbf{x}_0 = \mathbf{y}$, the iterative sequence $\{\mathbf{x}_n\}$
is convergent. Therefore, the restriction on the initial value is
that it lies in the solution space, rather than that it is the
least-squares solution. However, initializing with the least-squares
solution is still convenient.

IV. Convergence in Noisy Scenario 

The convergence of $\ell_1$-ZAP in the noisy scenario is analyzed
in this section. The main theorem for the noisy scenario is given in
Subsection A. In Subsection B, the problem of signal recovery
from inaccurate measurements is discussed. Subsection C
shows different choices of the initial value and their impact on the
quality of reconstruction. The reconstruction of p-compressible
signals by $\ell_1$-ZAP is discussed in Subsection D.

A. Main Result in Noisy Scenario

Considering the perturbation on the measurement vector $\mathbf{y}$,
Theorem 7 is presented to analyze the convergence of
$\ell_1$-ZAP. Similar to Lemma 1, Lemma 3 is first proposed
for the noisy case.

Lemma 3: Suppose that $\mathbf{x} \in \mathbb{R}^N$ satisfies $\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2 \le \varepsilon$,
with given $\mathbf{A} \in \mathbb{R}^{M \times N}$ and $\mathbf{y} \in \mathbb{R}^{M}$, that $\mathbf{x}^*$ is the unique
solution of ($P_2$), and that $\|\mathbf{x} - \mathbf{x}^*\|_2$ is bounded by a positive
number $M_0$. Then there exists a positive number $t$ depending on
$\mathbf{A}$, $\mathbf{y}$, $M_0$, and $\varepsilon$, such that

$$\|\mathbf{x}\|_1 - \|\mathbf{x}^*\|_1 \ge t\,\|\mathbf{x} - \mathbf{x}^*\|_2. \tag{27}$$



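Proof: With the definition (13), (27) is equivalent to
the following inequality

$$\inf_{\mathbf{x}} g(\mathbf{x}) > 0, \quad \text{subject to } \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2 \le \varepsilon \text{ and } 0 < \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0. \tag{28}$$

Following the proof of Lemma 1, it can be readily proved
that Lemma 3 is correct. Notice that here

$$\mathbf{u} = \frac{\mathbf{x} - \mathbf{x}^*}{\|\mathbf{x} - \mathbf{x}^*\|_2}$$

is not in the null-space of $\mathbf{A}$, but a unit vector satisfying
$\|\mathbf{A}\mathbf{u}\|_2 \le 2\varepsilon$. The remaining steps are similar. The
details of the proof are omitted for brevity. ■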

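Theorem 7: Supposing that $\mathbf{x}^*$ is the unique solution of
($P_2$), the sequence $\{\mathbf{x}_n\}$ satisfies the iterative formula (9) with
conditions

$$\|\mathbf{y} - \mathbf{A}\mathbf{x}_n\|_2 \le \varepsilon \tag{29}$$

and

$$\|\mathbf{x}_n - \mathbf{x}^*\|_2 \le M_0, \tag{30}$$

where $M_0$ is a positive constant. Then the iteration obeys

$$\|\mathbf{x}_{n+1} - \mathbf{x}^*\|_2^2 \le \|\mathbf{x}_n - \mathbf{x}^*\|_2^2 - d\gamma^2$$

when

$$\|\mathbf{x}_n - \mathbf{x}^*\|_2 \ge K\gamma + C\varepsilon,$$

where $C = 2\sqrt{N\lambda}/t$, and $K$ and $d$ are defined by (18) and (19),
respectively. Here $\mu > 1$ is a parameter, $t$ is the positive lower
bound in Lemma 3, and $\lambda$ is the largest eigenvalue of the matrix
$(\mathbf{A}\mathbf{A}^T)^{-1}$.

The proof of Theorem 7 is given in Appendix F.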

Theorem 7 indicates that under measurement perturbation
with energy less than $\varepsilon$, the iterative sequence $\{\mathbf{x}_n\}$ will
get into the $(K\gamma + C\varepsilon)$-neighborhood of $\mathbf{x}^*$. For a fixed
original signal and measurement matrix, the precision of $\mathbf{x}_n$
approaching $\mathbf{x}^*$ depends on both the step-size and the noise
energy bound. This means that $\mathbf{x}_n$ cannot get close to the solution
$\mathbf{x}^*$ at any precision by choosing a small step-size, because the
noise energy also controls a deviation component, $C\varepsilon$.
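The two contributions to the radius in Theorem 7 can be separated numerically. The sketch below evaluates $K\gamma + C\varepsilon$ with $C = 2\sqrt{N\lambda}/t$. All inputs are assumed values: for a single-row $\mathbf{A} = (1, 3, 2)$ the matrix $(\mathbf{A}\mathbf{A}^T)^{-1}$ is the scalar $1/14$, and `K` and `t` are placeholders rather than quantities derived from the paper.

```python
import math

def noisy_radius(K, gamma, eps, N, lam, t):
    """Radius K*gamma + C*eps with C = 2*sqrt(N*lam)/t, as stated in Theorem 7."""
    C = 2.0 * math.sqrt(N * lam) / t
    return K * gamma + C * eps, C

# Hypothetical inputs: lam = 1/14 corresponds to the single row (1, 3, 2).
radius, C = noisy_radius(K=11.25, gamma=0.01, eps=0.05, N=3, lam=1.0 / 14.0, t=0.2)
noise_free_radius, _ = noisy_radius(K=11.25, gamma=0.01, eps=0.0, N=3, lam=1.0 / 14.0, t=0.2)
print(radius, C, noise_free_radius)
```

Shrinking `gamma` reduces only the first term, which illustrates why the noise term $C\varepsilon$ sets a floor on the achievable precision.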



B. Signal Recovery from Inaccurate Measurements 

Corollary 2 indicates the property of signal reconstruction
with inaccurate measurements.

Corollary 2: Suppose the original signal is $\mathbf{x}^{o} \in \mathbb{R}^N$,
and the conditions (3) and (5) are satisfied. There exist
real numbers $K > 0$, $C' > 0$ such that $\ell_1$-ZAP
converges to a $(K\gamma + C'\varepsilon)$-neighborhood of $\mathbf{x}^{o}$, i.e.,
$\ell_1$-ZAP can approach the original signal to some extent under
inaccurate measurements.

Proof: Referring to the proof of Corollary 1, it can be
readily accepted that the condition (30) is always satisfied
for any index $n$. It is known from Theorem 3 that ($P_2$)
has a unique solution under the condition (5). Consequently,
according to Theorem 7, the sequence $\{\mathbf{x}_n\}$ finally gets into
the neighborhood of $\mathbf{x}^*$ with the radius $K\gamma + C\varepsilon$.

Theorem 2 shows that under the condition (3), the
solution of ($P_2$) is not far from the original signal $\mathbf{x}^{o}$, with
the inequality

$$\|\mathbf{x}^* - \mathbf{x}^{o}\|_2 \le C_S\,\varepsilon. \tag{31}$$

Combining Theorem 7, (31), and the triangle inequality, one
sees that the sequence gets into the neighborhood of $\mathbf{x}^{o}$ with
the radius $K\gamma + (C + C_S)\varepsilon$. Denote $C' = C + C_S$ and the
conclusion of Corollary 2 is drawn. ■



C. Initial Values 

Among the assumptions of Theorem 7, the condition (29)
is assumed to be satisfied. Considering the recursion (9), one
readily sees that

$$\|\mathbf{y} - \mathbf{A}\mathbf{x}_n\|_2 = \|\mathbf{y} - \mathbf{A}\mathbf{x}_0\|_2. \tag{32}$$

Under the simple condition

$$\|\mathbf{y} - \mathbf{A}\mathbf{x}_0\|_2 \le \varepsilon, \tag{33}$$

where $\mathbf{x}_0$ is not necessarily the least-squares solution of
$\mathbf{y} = \mathbf{A}\mathbf{x}$, (29) follows, which satisfies the condition
of Theorem 7.
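The invariance (32) can be checked with an initial value that is not the least-squares solution. The sketch below applies recursion (9) for a single-row measurement and records the residual $|y - \mathbf{a}^T\mathbf{x}_n|$ at every step. The single-row setting, the noisy value of `y`, and the starting point are assumptions made for this example.

```python
def sign(x):
    return [(v > 0) - (v < 0) for v in x]

def residual_norm(a, x, y):
    # With one measurement (M = 1) the l2 norm reduces to an absolute value.
    return abs(y - sum(ai * xi for ai, xi in zip(a, x)))

a = [1.0, 3.0, 2.0]        # hypothetical measurement row
y = 3.05                   # hypothetical noisy measurement
x = [0.1, 0.9, 0.1]        # arbitrary start, not the least-squares solution
gamma = 0.01
aa = sum(v * v for v in a)

r0 = residual_norm(a, x, y)
history = []
for _ in range(100):
    s = sign(x)
    coef = sum(ai * si for ai, si in zip(a, s)) / aa
    # x_{n+1} = x_n - gamma * P sgn(x_n), with P s = s - a (a^T s) / (a^T a)
    x = [xi - gamma * (si - ai * coef) for xi, si, ai in zip(x, s, a)]
    history.append(residual_norm(a, x, y))
print(r0, max(history), min(history))
```

Since every update lies in the null space of $\mathbf{A}$, the residual of the starting point is carried unchanged through the iterations, so a bound on $\|\mathbf{y} - \mathbf{A}\mathbf{x}_0\|_2$ is all that condition (29) requires.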

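If the initial value satisfies (33), by defining
$\mathbf{e}_n = \mathbf{A}(\mathbf{x}_n - \mathbf{x}^*)$, one has

$$\|\mathbf{e}_n\|_2 = \|(\mathbf{y} - \mathbf{A}\mathbf{x}_n) - (\mathbf{y} - \mathbf{A}\mathbf{x}^*)\|_2 \le \|\mathbf{y} - \mathbf{A}\mathbf{x}_n\|_2 + \|\mathbf{y} - \mathbf{A}\mathbf{x}^*\|_2 \le 2\varepsilon. \tag{34}$$

Inequality (34) provides the upper bound of $\|\mathbf{e}_n\|_2$ and it is
used to prove Theorem 7.

If the iterations begin with the least-squares solution of the
perturbed measurement $\mathbf{y}$, it obeys $\mathbf{y} = \mathbf{A}\mathbf{x}_0$ and according
to (32) one has

$$\|\mathbf{y} - \mathbf{A}\mathbf{x}_n\|_2 = 0,$$

which means that (34) can be modified to

$$\|\mathbf{e}_n\|_2 \le \varepsilon. \tag{35}$$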



Hence, the parameter $\varepsilon$ can be reduced to a half throughout the
proof of Theorem 7. Therefore, if the initial value is chosen
as the least-squares solution, the neighborhood of convergence
will be smaller, i.e., a better estimate can be reached.

D. Discussions on p-compressible Signal

The original signal is not always absolutely sparse. The
reconstruction of compressible signals is discussed here. A signal
$\mathbf{x}$ is p-compressible with magnitude $R$ if the components of
$\mathbf{x}$ decay as

$$x_{(i)} \le R \cdot i^{-1/p},$$

where $x_{(i)}$ is the $i$th largest absolute value among the
components of $\mathbf{x}$, and $p$ is a number between 0 and 1. Supposing
that $\mathbf{x}_S$ is a best $S$-sparse approximation to $\mathbf{x}$, the following
inequalities hold [34],

$$\|\mathbf{x} - \mathbf{x}_S\|_1 \le C_p \cdot R \cdot S^{1 - 1/p}, \tag{36}$$

$$\|\mathbf{x} - \mathbf{x}_S\|_2 \le D_p \cdot R \cdot S^{1/2 - 1/p}, \tag{37}$$

where $C_p = (1/p - 1)^{-1}$ and $D_p = (2/p - 1)^{-1/2}$.
For a p-compressible signal $\mathbf{x}$, one has

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{e} = \mathbf{A}\mathbf{x}_S + (\mathbf{A}(\mathbf{x} - \mathbf{x}_S) + \mathbf{e}).$$

By Proposition 3.5 in [13], the norm of $\mathbf{A}(\mathbf{x} - \mathbf{x}_S)$ can be
estimated as

$$\|\mathbf{A}(\mathbf{x} - \mathbf{x}_S)\|_2 \le \sqrt{1 + \delta_S}\left(\|\mathbf{x} - \mathbf{x}_S\|_2 + \frac{1}{\sqrt{S}}\|\mathbf{x} - \mathbf{x}_S\|_1\right). \tag{38}$$

Combining (36), (37) and (38), one has

$$\|\mathbf{A}(\mathbf{x} - \mathbf{x}_S) + \mathbf{e}\|_2 \le \sqrt{1 + \delta_S}\,(D_p + C_p) \cdot R \cdot S^{1/2 - 1/p} + \varepsilon.$$

According to Theorem 7 and Corollary 2, the reconstruction
property of p-compressible signals by $\ell_1$-ZAP can be deduced
as follows.

Corollary 3: Supposing $\mathbf{x} \in \mathbb{R}^N$ is a p-compressible signal
and the conditions (3) and (5) are satisfied, then the $\ell_1$-ZAP
sequence can approach $\mathbf{x}$ with a deviation

$$K\gamma + C'\varepsilon + C'\sqrt{1 + \delta_S}\,(D_p + C_p) \cdot R \cdot S^{1/2 - 1/p},$$

where $K$ and $C'$ are the same as those in Corollary 2, and
$\varepsilon$ is the energy bound of the observation noise.

The non-noisy scenario for compressible signals can be
naturally obtained by setting $\varepsilon$ to zero in Corollary 3.
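The tail bounds used for compressible signals can be checked on a synthetic example. The sketch below builds a signal whose sorted magnitudes equal $R \cdot i^{-1/p}$ exactly and compares the $\ell_1$ and $\ell_2$ norms of the part outside the $S$ largest entries with the bounds $C_p R S^{1-1/p}$ and $D_p R S^{1/2-1/p}$. The function names and the values of `R`, `p`, `N`, and `S` are assumptions for illustration.

```python
import math

def tail_norms(x, S):
    """l1 and l2 norms of x minus its best S-sparse approximation."""
    mags = sorted((abs(v) for v in x), reverse=True)
    tail = mags[S:]
    return sum(tail), math.sqrt(sum(v * v for v in tail))

def tail_bounds(R, p, S):
    """Bounds C_p R S^(1-1/p) and D_p R S^(1/2-1/p) with the stated C_p and D_p."""
    Cp = 1.0 / (1.0 / p - 1.0)
    Dp = 1.0 / math.sqrt(2.0 / p - 1.0)
    return Cp * R * S ** (1.0 - 1.0 / p), Dp * R * S ** (0.5 - 1.0 / p)

# Hypothetical compressible signal with magnitudes R * i^(-1/p).
R, p, N, S = 1.0, 0.5, 1000, 50
x = [R * i ** (-1.0 / p) for i in range(1, N + 1)]
l1_tail, l2_tail = tail_norms(x, S)
l1_bound, l2_bound = tail_bounds(R, p, S)
print(l1_tail, l1_bound, l2_tail, l2_bound)
```

Both tails shrink as $S$ grows, which is the effect that enters the deviation in Corollary 3 through the factor $S^{1/2 - 1/p}$.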

V. Experiments 

Several experiments are conducted in this section. The
performance of $\ell_0$-ZAP and $\ell_1$-ZAP is shown in Subsection
A, compared with several other algorithms for sparse recovery.
The deviations of the actual $\ell_1$-ZAP sequence and the bound
sequences in the proof are illustrated in Subsection B. In
Subsection C, experimental results demonstrate the impacts of
the step-size and the noise level on the signal reconstruction
via $\ell_1$-ZAP.

A. Performance of ZAP 

The performances of $\ell_1$-ZAP and $\ell_0$-ZAP are simulated and compared with other sparse recovery algorithms.

In the experiments, the $M \times N$ matrix $\mathbf{A}$ is generated with independent entries following a normal distribution with mean zero and variance $1/M$. The support set of the original signal $\mathbf{x}^*$ is chosen uniformly at random. The nonzero entries follow a normal distribution with mean zero. Finally, the energy of the original signal is normalized.

Fig. 1. Probability of exact reconstruction for various numbers of measurements, where $N = 1000$, $S = 50$.
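The test-problem generation described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `make_problem` is hypothetical, the nonzero entries are drawn with unit variance before normalization (the paper only states zero mean), and the measurements are taken without noise.

```python
import numpy as np

def make_problem(M, N, S, rng):
    """Draw a Gaussian sensing matrix and a unit-energy S-sparse signal."""
    # Entries of A: independent, zero mean, variance 1/M.
    A = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, N))
    # Support chosen uniformly at random among all size-S subsets.
    support = rng.choice(N, size=S, replace=False)
    x_true = np.zeros(N)
    x_true[support] = rng.normal(size=S)   # assumed unit variance
    x_true /= np.linalg.norm(x_true)       # normalize the signal energy
    y = A @ x_true                         # noise-free measurements
    return A, x_true, y

A, x_true, y = make_problem(M=200, N=1000, S=50, rng=np.random.default_rng(0))
```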

For parameters $N = 1000$, $S = 50$, the probability of exact reconstruction for various numbers of measurements is shown in Fig. 1. If the reconstruction SNR is higher than a threshold of 40 dB, the trial is regarded as an exact reconstruction. The number of measurements $M$ varies from 140 to 320 and each point in the experiment is repeated 200 times. The step-size of $\ell_1$-ZAP
is 5 X lO^"'. The parameters of other algorithms are selected 
as recommended by the respective authors. It can be seen that for any fixed $M$ from 180 to 260, $\ell_0$-ZAP and $\ell_1$-ZAP have a higher probability of reconstruction than the other algorithms, which means that ZAP algorithms demand fewer measurements in signal reconstruction. The experiment also indicates that the performance of $\ell_0$-ZAP is better than that of $\ell_1$-ZAP, as discussed in Section II.
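The exact-reconstruction criterion can be sketched as below. The paper does not spell out the SNR formula in this passage, so the energy-ratio definition used here is an assumption, and the function names are hypothetical.

```python
import numpy as np

def reconstruction_snr_db(x_true, x_hat):
    """Reconstruction SNR in dB, assumed to be signal energy over error energy."""
    err_energy = np.linalg.norm(x_true - x_hat) ** 2
    if err_energy == 0.0:
        return np.inf
    return 10.0 * np.log10(np.linalg.norm(x_true) ** 2 / err_energy)

def is_exact(x_true, x_hat, threshold_db=40.0):
    """A trial counts as exact reconstruction above the 40 dB threshold."""
    return reconstruction_snr_db(x_true, x_hat) > threshold_db
```

The empirical probability of exact reconstruction is then the fraction of repeated trials for which `is_exact` returns `True`.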

For parameters $N = 1000$, $M = 200$, Fig. 2 illustrates the probability of exact reconstruction for various sparsity levels $S$ from 25 to 70. All the algorithms are repeated 200 times for each value. The parameters of the algorithms are the same as those in the previous experiment. $\ell_0$-ZAP has the highest probability for fixed sparsity $S$, and $\ell_1$-ZAP is second, ahead of the other conventional algorithms. The experiment indicates that ZAP algorithms can recover less sparse signals than the other algorithms.

The SNR performance is illustrated in Fig. 3, with the measurement SNR varying from 5 dB to 30 dB and 200 trials for each value. The noise is zero-mean white Gaussian and is added to the observed vector $\mathbf{y}$. The parameters are selected as $N = 1000$, $M = 200$ and $S = 30$. The parameters of the algorithms are the same as in the previous experiments. The reconstruction SNR and the measurement SNR are the signal-to-noise ratios of the reconstructed signal $\mathbf{x}$ and the measurement signal $\mathbf{y}$, respectively. $\ell_0$-ZAP outperforms the other algorithms, while $\ell_1$-ZAP performs almost the same as the others. The experiment indicates that $\ell_0$-ZAP has better performance against noise and that $\ell_1$-ZAP has no visible defects compared with the other algorithms.

The experiments above demonstrate that $\ell_1$-ZAP has a better performance compared with conventional algorithms.

Fig. 2. Probability of exact reconstruction for various sparsity, where $N = 1000$, $M = 200$.



Fig. 4. Reconstruction SNR of actual sequence and bound sequences with 
different choices of ^, where M = 250, A^ = 1000, 5 = 50, 7 = 5 X 10"". 
Fig. 3. Reconstruction SNR versus measurement SNR, where $N = 1000$, $M = 200$, $S = 30$.



Fig. 5. The value of $\mu - 1$ throughout the iteration for adaptive $\mu$.



$\ell_1$-ZAP demands fewer measurements and can recover less sparse signals, with similar robustness against noise. The performance of $\ell_0$-ZAP is better than that of $\ell_1$-ZAP.

B. Actual Sequence and Bound Sequences 

According to Theorem 4, the deviation from the actual iterative sequence to the sparse solution is bounded by the sequence satisfying (22). In Theorem 4, a sequence with parameter $\mu$ is utilized to bound the actual sequence and is proved to be convergent. As discussed in III-E and F, the
sequence defined in ( |23] | and jTM with adaptive fj, approaches 
the sparse solution faster than any sequence with constant $\mu$.

The reconstruction SNR curves of the actual sequence and three bound sequences with different choices of $\mu$ are demonstrated in Fig. 4. As can be seen in the figure, the bound sequence with adaptive $\mu$ is the best estimation among the different choices. For a constant $\mu$, a larger value leads to faster convergence and less precision.

For adaptive $\mu$, as illustrated in Fig. 4, the reconstruction SNR reaches steady state after about 2000 iterations. However, referring to Fig. 5, the value of $\mu$ keeps decreasing until over 6000 iterations, though it has little impact on the convergence behavior. In fact, the adaptive $\mu$ decreases towards 1 throughout the iteration and never stops. Nevertheless, the precision of the simulation platform limits its variation after it is below
3 X 10-1-^. 

The deviations of the actual iterative sequence and a bound sequence are both proportional to the step-size, differing only in the scale factor. Though the bound is not very tight, it serves well in the proof of the convergence of $\ell_1$-ZAP.
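The actual sequence discussed here follows the recursion $\mathbf{x}_{n+1} = \mathbf{x}_n - \gamma \mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n)$ used in the proof of Theorem 4, with $\mathbf{P}$ the projector onto the null space of $\mathbf{A}$. A minimal sketch is given below; the function name `l1_zap`, the least-norm initialization, and the fixed iteration count are assumptions made for illustration rather than the authors' reference implementation.

```python
import numpy as np

def l1_zap(A, y, step, n_iter):
    """Sketch of the l1-ZAP recursion x <- x - step * P sgn(x)."""
    A_pinv = np.linalg.pinv(A)                # equals A^T (A A^T)^{-1} for full row rank
    P = np.eye(A.shape[1]) - A_pinv @ A       # projector onto the null space of A
    x = A_pinv @ y                            # assumed start: least-norm solution of y = A x
    for _ in range(n_iter):
        x = x - step * (P @ np.sign(x))       # zero-point attraction, kept feasible by P
    return x
```

Since $\mathbf{A}\mathbf{P} = \mathbf{0}$, every iterate stays in the solution space of $\mathbf{y} = \mathbf{A}\mathbf{x}$, which is the property the proofs rely on when they write $\mathbf{h}_n^{\mathrm{T}}\mathbf{A}^{\mathrm{T}} = \mathbf{0}$.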

C. About Step-size and Noise 

As proved in Theorem 4, in the non-noisy scenario, $\ell_1$-ZAP can reconstruct the original signal at arbitrary precision by choosing the step-size small enough. Theorem 7 demonstrates that in the noisy scenario the reconstruction SNR is determined by both the step-size and the noise level. Experiment results shown in Fig. 6 verify the analysis. Each combination of step-size and measurement SNR is simulated 100 times. Experiment results indicate that in the non-noisy scenario, the reconstruction SNR increases as the step-size decreases. In the noisy scenario, the reconstruction SNR cannot increase arbitrarily due to the impact of noise. For small step-size, the reconstruction SNR is mainly determined by the noise level: it is higher when the measurement SNR is higher. For large step-size, the step-size mainly controls the reconstruction SNR, and the reconstruction SNR increases as the step-size decreases. The figure also offers a way to choose the step-size under noise. It is not necessary to choose the step-size too small because it benefits little under the impact of noise. For an estimated reconstruction SNR, the best choice of step-size is the value at which the curve just enters the flat region.

PROOF OF CONVERGENCE AND PERFORMANCE ANALYSIS FOR SPARSE RECOVERY VIA ZERO-POINT ATTRACTING PROJECTION

[Figure 6: reconstruction SNR versus step-size, with curves for measurement SNR = 10 dB, 20 dB, 30 dB, and the non-noisy case.]

Fig. 6. Reconstruction SNR versus step-size for various SNR, where $M = 150$, $N = 1000$, $S = 20$.
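One rough way to locate the flat region, assuming the deviation bound has the form $K\gamma + C'\varepsilon$ stated in Corollary 3 for a strictly sparse signal (the symbol $\gamma^\star$ is introduced here only for illustration), is to find the step-size at which the two terms are comparable:

$$K\gamma^\star \approx C'\varepsilon \quad\Longrightarrow\quad \gamma^\star \approx \frac{C'\varepsilon}{K}.$$

At $\gamma = \gamma^\star$ the bound equals $2C'\varepsilon$, and letting $\gamma \to 0$ can reduce it only to $C'\varepsilon$, so step-sizes far below $\gamma^\star$ gain at most a factor of two in the bound. This is a heuristic reading of the bound and is not a result stated in the paper.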

VI. Conclusion 

This paper provides a comprehensive theoretical analysis of $\ell_1$-ZAP. Firstly, the algorithm is proved to converge to a neighborhood of the sparse solution whose radius is proportional to the step-size of the iteration. Therefore, it is non-biased and can approach the sparse solution to any extent and reconstruct the original signal exactly. Secondly, when the measurements are inaccurate due to noise perturbation, $\ell_1$-ZAP can also approach the sparse solution, and the precision is linearly reduced by the disturbance power. In addition, some related topics about the initial value and the convergence rate are also discussed. The convergence property of $\ell_1$-ZAP for $p$-compressible signals is also discussed. Finally, experiments are conducted to verify the theoretical analysis of the convergence process and to illustrate the impacts of the parameters on the reconstruction results.

VII. Appendix 

A. The Proof of Lemma 1 

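Proof: It is to be proved that $g(\mathbf{x})$ defined in (13) has a positive lower bound respectively for $0 < \|\mathbf{x} - \mathbf{x}^*\|_2 < r_0$ and $r_0 \le \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0$, where $r_0$ is defined in (15). Define sets $\mathcal{X}_1$ and $\mathcal{X}_2$ as

$$\begin{aligned}
\mathcal{X}_1 &\triangleq \{\mathbf{x} \mid r_0 \le \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0\} \cap \{\mathbf{x} \mid \mathbf{y} = \mathbf{A}\mathbf{x}\}, \\
\mathcal{X}_2 &\triangleq \{\mathbf{x} \mid 0 < \|\mathbf{x} - \mathbf{x}^*\|_2 < r_0\} \cap \{\mathbf{x} \mid \mathbf{y} = \mathbf{A}\mathbf{x}\}.
\end{aligned} \tag{39}$$

For $\mathbf{x} \in \mathcal{X}_1$, the function $g(\mathbf{x})$ is continuous in $\mathbf{x}$ and the domain is a bounded closed set. By a basic theorem in calculus, a continuous function attains its infimum on a bounded closed set. As a consequence, there exists an $\mathbf{x}_0 \in \mathcal{X}_1$ such that $g(\mathbf{x}_0) = \inf_{\mathbf{x} \in \mathcal{X}_1} g(\mathbf{x})$. By the uniqueness of $\mathbf{x}^*$ and the definition of $g(\mathbf{x})$, $g(\mathbf{x})$ is positive in $\mathcal{X}_1$. Then $g(\mathbf{x}_0)$ is positive, and this leads to the conclusion that the infimum of $g(\mathbf{x})$ is positive in $\mathcal{X}_1$.

On the other hand, it will be proved that $g(\mathbf{x})$ has a positive lower bound for $\mathbf{x} \in \mathcal{X}_2$.

Any vector in the solution space of $\mathbf{y} = \mathbf{A}\mathbf{x}$ can be represented by

$$\mathbf{x} = \mathbf{x}^* + r \cdot \mathbf{u}, \tag{40}$$

where

$$r = \|\mathbf{x} - \mathbf{x}^*\|_2, \tag{41}$$

$$\mathbf{u} = (\mathbf{x} - \mathbf{x}^*)/\|\mathbf{x} - \mathbf{x}^*\|_2 \tag{42}$$

denote the distance and direction, respectively. Considering the definition of $r_0$, one has

$$(\mathbf{x}^*)^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}) = (\mathbf{x}^*)^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}^*) = \|\mathbf{x}^*\|_1. \tag{43}$$

Combining (40) with (43), one gets

$$\|\mathbf{x}\|_1 = (\mathbf{x}^* + r \cdot \mathbf{u})^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}) = \|\mathbf{x}^*\|_1 + r \cdot \mathbf{u}^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}). \tag{44}$$

As a consequence, for $0 < r < r_0$, the objective function can be simplified as

$$g(\mathbf{x}) = \frac{\|\mathbf{x}\|_1 - \|\mathbf{x}^*\|_1}{\|\mathbf{x} - \mathbf{x}^*\|_2} = \mathbf{u}^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}) = \sum_{k=1}^{N} u_k\,\mathrm{sgn}(x_k). \tag{45}$$

Index set $\mathcal{I} = \{k \mid x_k^* \ne 0,\ 1 \le k \le N\}$ is the support set of $\mathbf{x}^*$, and $\mathcal{I}^c$ denotes the complement of $\mathcal{I}$. For $\forall k \in \mathcal{I}$, considering the definition of $r_0$,

$$u_k\,\mathrm{sgn}(x_k) = u_k\,\mathrm{sgn}(x_k^*).$$

For $\forall k \in \mathcal{I}^c$, considering $x_k^* = 0$ and the definition of $\mathbf{u}$ in (42),

$$u_k\,\mathrm{sgn}(x_k) = u_k\,\mathrm{sgn}(u_k) = |u_k|.$$

Consequently, $g(\mathbf{x})$ can be rewritten as a function of $\mathbf{u}$,

$$\begin{aligned}
g(\mathbf{x}) &= \sum_{k \in \mathcal{I}} u_k\,\mathrm{sgn}(x_k) + \sum_{k \in \mathcal{I}^c} u_k\,\mathrm{sgn}(x_k) \\
&= \mathbf{u}_{\mathcal{I}}^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}^*) + \|\mathbf{u}_{\mathcal{I}^c}\|_1 \triangleq G(\mathbf{u}),
\end{aligned} \tag{46}$$

where

$$(\mathbf{u}_{\mathcal{I}})_k = \begin{cases} u_k, & k \in \mathcal{I}; \\ 0, & \text{elsewhere}, \end{cases}$$

and

$$\mathbf{u}_{\mathcal{I}^c} = \mathbf{u} - \mathbf{u}_{\mathcal{I}}.$$

It can be seen that $G(\mathbf{u})$ is continuous in $\mathbf{u}$ and the domain of $G(\mathbf{u})$ is $\{\mathbf{u} \in \mathbb{R}^N \mid \|\mathbf{u}\|_2 = 1\} \cap \{\mathbf{u} \in \mathbb{R}^N \mid \mathbf{A}\mathbf{u} = \mathbf{0}\}$. Since the domain of $G(\mathbf{u})$ is the intersection of two closed sets and the first set is bounded, it is a bounded closed set and $G(\mathbf{u})$ attains its infimum. Then $g(\mathbf{x})$ has a minimum. By the uniqueness of $\mathbf{x}^*$, $g(\mathbf{x})$ is positive; consequently $\min_{\mathbf{x} \in \mathcal{X}_2} g(\mathbf{x}) > 0$.

To sum up, the lower bound of $g(\mathbf{x})$ is positive for

$$\mathbf{x} \in \mathcal{X}_1 \cup \mathcal{X}_2 = \{\mathbf{x} \mid 0 < \|\mathbf{x} - \mathbf{x}^*\|_2 \le M_0\} \cap \{\mathbf{x} \mid \mathbf{y} = \mathbf{A}\mathbf{x}\},$$

which completes the proof of Lemma 1. ■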
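The simplification in (45) and (46) can be verified numerically for a small case. The sketch below uses a hypothetical sparse vector and a random unit direction; the identity holds for any unit $\mathbf{u}$ once $r$ is smaller than the smallest nonzero magnitude of $\mathbf{x}^*$ (taken here as the role of $r_0$, which is an assumption about its definition), so the kernel constraint $\mathbf{A}\mathbf{u} = \mathbf{0}$ is not needed for this check.

```python
import numpy as np

def g(x, x_star):
    """Ratio (||x||_1 - ||x*||_1) / ||x - x*||_2 from (45)."""
    return (np.abs(x).sum() - np.abs(x_star).sum()) / np.linalg.norm(x - x_star)

def G(u, x_star):
    """Closed form from (46): u_I^T sgn(x*) + ||u_{I^c}||_1."""
    on = x_star != 0
    return u[on] @ np.sign(x_star[on]) + np.abs(u[~on]).sum()

rng = np.random.default_rng(1)
x_star = np.zeros(12)
x_star[[2, 5, 9]] = [1.5, -0.8, 2.0]          # hypothetical sparse solution
r0 = np.abs(x_star[x_star != 0]).min()        # smallest nonzero magnitude (assumed r_0)
u = rng.normal(size=12)
u /= np.linalg.norm(u)                        # unit direction
x = x_star + 0.5 * r0 * u                     # a point with 0 < r < r_0

print(g(x, x_star), G(u, x_star))
```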
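B. The Proof of Theorem 4

Proof: By denoting

$$\mathbf{h}_n = \mathbf{x}_n - \mathbf{x}^*$$

as the iterative deviation and subtracting the unique solution $\mathbf{x}^*$ from both sides of (9), one has

$$\begin{aligned}
\|\mathbf{h}_{n+1}\|_2^2 &= \|\mathbf{h}_n - \gamma\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n)\|_2^2 \\
&= \|\mathbf{h}_n\|_2^2 - 2\gamma\,\mathbf{h}_n^{\mathrm{T}}\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n) + \gamma^2\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n)\|_2^2.
\end{aligned} \tag{47}$$

According to (11),

$$\mathbf{h}_n^{\mathrm{T}}\mathbf{A}^{\mathrm{T}} = (\mathbf{x}_n - \mathbf{x}^*)^{\mathrm{T}}\mathbf{A}^{\mathrm{T}} = \mathbf{0}.$$

Considering

$$\mathbf{h}_n^{\mathrm{T}}\mathbf{P} = \mathbf{h}_n^{\mathrm{T}} - \mathbf{h}_n^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A} = \mathbf{h}_n^{\mathrm{T}},$$

$$(\mathbf{x}^*)^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}_n) \le (\mathbf{x}^*)^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}^*) = \|\mathbf{x}^*\|_1$$

and using Lemma 1, one can shrink the second item of (47) to

$$\mathbf{h}_n^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}_n) \ge \|\mathbf{x}_n\|_1 - \|\mathbf{x}^*\|_1 \ge t\,\|\mathbf{h}_n\|_2. \tag{48}$$

Using (47) and (48), one has

$$\|\mathbf{h}_{n+1}\|_2^2 \le \|\mathbf{h}_n\|_2^2 - 2\gamma t\,\|\mathbf{h}_n\|_2 + \gamma^2\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n)\|_2^2.$$

Consequently, for any $\mu > 1$, if

$$\|\mathbf{h}_n\|_2 \ge K\gamma = \gamma\,\frac{\mu}{2t}\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2,$$

one has

$$\begin{aligned}
\|\mathbf{h}_{n+1}\|_2^2 &\le \|\mathbf{h}_n\|_2^2 - d\gamma^2 \\
&= \|\mathbf{h}_n\|_2^2 - \gamma^2(\mu - 1)\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2.
\end{aligned}$$

Theorem 4 is proved. ■

C. The Proof of Theorem 5

Proof: Noticing that $\mathbf{u}$ is in the kernel of $\mathbf{A}$ and $\mathbf{P}$ is a symmetric projection matrix onto the solution space, with (40) and (42), one has

$$\mathbf{P}\mathbf{u} = \mathbf{u}.$$

Because $\mathbf{u}$ is a unit vector, it can be further derived that

$$\begin{aligned}
\mathbf{u}^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}) &= (\mathbf{P}\mathbf{u})^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}) \\
&= \langle \mathbf{u}, \mathbf{P}\,\mathrm{sgn}(\mathbf{x}) \rangle \\
&\le \left\langle \frac{\mathbf{P}\,\mathrm{sgn}(\mathbf{x})}{\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2}, \mathbf{P}\,\mathrm{sgn}(\mathbf{x}) \right\rangle \\
&= \|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2.
\end{aligned} \tag{49}$$

Consider the definition of $t$ in (12) and (45),

$$t \le \inf_{\mathbf{x} \in \mathcal{X}_1 \cup \mathcal{X}_2} g(\mathbf{x}) \le \inf_{\mathbf{x} \in \mathcal{X}_2} g(\mathbf{x}) \le \mathbf{u}^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}), \tag{50}$$

where $\mathcal{X}_1$ and $\mathcal{X}_2$ are defined in (39). Combining (49) and (50), the left inequality of (20) is proved.

Now let us turn to the right inequality of (20). Because of the property of a projection matrix, $\mathbf{P} = \mathbf{P}^2$, each eigenvalue of $\mathbf{P}$ is either 0 or 1. For all $\mathbf{x}$, one has

$$\begin{aligned}
\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 &= \mathrm{sgn}^{\mathrm{T}}(\mathbf{x})\,\mathbf{P}\,\mathrm{sgn}(\mathbf{x}) \\
&\le \max\{\lambda_{\mathbf{P}}\}\,\mathrm{sgn}^{\mathrm{T}}(\mathbf{x})\,\mathrm{sgn}(\mathbf{x}) \\
&= \|\mathrm{sgn}(\mathbf{x})\|_2^2 \le N,
\end{aligned}$$

where $\{\lambda_{\mathbf{P}}\}$ denotes the eigenvalue set of $\mathbf{P}$. The arbitrariness of $\mathbf{x}$ leads to

$$\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2 \le \sqrt{N}.$$

Therefore, Theorem 5 is proved. ■

D. The Proof of Lemma 2

Proof: For $K_{\min}$ satisfying (26), there exists $\mu' > 1$ such that

$$K_{\min} = \frac{\mu'}{2t}\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2. \tag{51}$$

Considering the recursion of sequence $\{\mathbf{x}'_n\}$ in (25), it is expected to prove that

$$\|\mathbf{x}'_n - \mathbf{x}^*\|_2^2 - 2\gamma t\,\|\mathbf{x}'_n - \mathbf{x}^*\|_2 + \gamma^2\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 < \left(\|\mathbf{x}'_n - \mathbf{x}^*\|_2 - \gamma t\,(1 - 1/\mu')\right)^2, \tag{52}$$

when

$$\|\mathbf{x}'_n - \mathbf{x}^*\|_2 \ge \gamma\,\frac{\mu'}{2t}\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2. \tag{53}$$

Using (53), the difference between the left side and the right side of (52) is

$$\gamma^2\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 - \gamma^2 t^2(1 - 1/\mu')^2 - 2\gamma t\,\|\mathbf{x}'_n - \mathbf{x}^*\|_2/\mu' \le -\gamma^2 t^2(1 - 1/\mu')^2 < 0. \tag{54}$$

As a consequence, (52) holds and it leads to

$$\|\mathbf{x}'_{n+1} - \mathbf{x}^*\|_2 \le \|\mathbf{x}'_n - \mathbf{x}^*\|_2 - \gamma t\,(1 - 1/\mu'). \tag{55}$$

According to (55), the quantity of decrease at each step is at least $\gamma t\,(1 - 1/\mu')$.

Considering that $\{\mathbf{x}_n\}$ has a faster convergence rate than that of $\{\mathbf{x}'_n\}$, and that the trip of $\{\mathbf{x}_n\}$ is from the $(K_{\max}\gamma)$-ball to the $(K_{\min}\gamma)$-ball, the iteration number is at most

$$\frac{(K_{\max} - K_{\min})\gamma}{\gamma t\,(1 - 1/\mu')} = \frac{2(K_{\max} - K_{\min})}{2t - \max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2/K_{\min}}.$$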
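The projector properties used in the proof of Theorem 5 can be checked on a small random instance. This is a sketch under the assumption that $\mathbf{P} = \mathbf{I} - \mathbf{A}^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}$ with $\mathbf{A}$ of full row rank; the dimensions and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 8, 20
A = rng.normal(size=(M, N))                        # full row rank with probability one
P = np.eye(N) - A.T @ np.linalg.solve(A @ A.T, A)  # projector onto the null space of A

s = np.sign(rng.normal(size=N))                    # an arbitrary sign vector sgn(x)
idempotent = np.allclose(P @ P, P)                 # P = P^2
symmetric = np.allclose(P, P.T)
eigvals = np.linalg.eigvalsh((P + P.T) / 2)        # eigenvalues should be 0 or 1
zero_or_one = np.allclose(np.minimum(np.abs(eigvals), np.abs(eigvals - 1)), 0, atol=1e-8)
norm_ok = np.linalg.norm(P @ s) <= np.sqrt(N)      # ||P sgn(x)||_2 <= sqrt(N)
print(idempotent, symmetric, zero_or_one, norm_ok)
```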



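E. The Proof of Theorem 6

Proof: According to Lemma 2, the iteration number needed from the $((n+1)K_0\gamma)$-neighborhood to the $(nK_0\gamma)$-neighborhood is at most

$$\frac{K_0}{t - \max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2/(2nK_0)} = \frac{K_0}{t}\left(1 + \frac{1}{n\mu_0 - 1}\right), \tag{56}$$

where

$$K_0 = \frac{\mu_0}{2t}\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2$$

and $\mu_0$ is larger than 1.

Assume that $M_0 = \|\mathbf{x}_0 - \mathbf{x}^*\|_2$ obeys

$$mK_0\gamma < M_0 \le (m+1)K_0\gamma,$$

where $m$ is a positive integer. Utilizing (56), the total iteration number from the $M_0$-neighborhood to the $(K_0\gamma)$-neighborhood is at most

$$\frac{K_0}{t}\sum_{n=1}^{m}\left(1 + \frac{1}{n\mu_0 - 1}\right), \tag{57}$$

which is less than

$$\frac{M_0}{t\gamma} + \frac{K_0}{t\mu_0}\ln\frac{M_0}{K_0\gamma} + \frac{2K_0}{2t - \max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2/K_0}. \tag{58}$$

Thus Theorem 6 is proved. The relation between (57) and (58) comes from the following plain algebra,

$$\begin{aligned}
\frac{K_0}{t}\sum_{n=1}^{m}\left(1 + \frac{1}{n\mu_0 - 1}\right)
&\le \frac{K_0}{t}\left(m + \frac{1}{\mu_0 - 1} + \sum_{n=2}^{m}\frac{1}{\mu_0(n-1)}\right) \\
&\le \frac{K_0}{t}\left(m + \frac{1}{\mu_0 - 1} + \frac{1}{\mu_0}\left(\ln(m-1) + 1\right)\right) \\
&< \frac{K_0}{t}\left(m + \frac{1}{\mu_0}\ln(m-1) + \frac{\mu_0}{\mu_0 - 1}\right) \\
&< \frac{M_0}{t\gamma} + \frac{K_0}{t\mu_0}\ln\frac{M_0}{K_0\gamma} + \frac{K_0}{t}\cdot\frac{\mu_0}{\mu_0 - 1}.
\end{aligned}$$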
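The gap between the sum in (57) and the closed-form bound in (58) can be inspected numerically. The sketch below is illustrative only: the function name and parameter values are hypothetical, and the third term of (58) is written as $(K_0/t)\,\mu_0/(\mu_0 - 1)$, which equals $2K_0/(2t - \max\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2/K_0)$ under the definition of $K_0$.

```python
import math

def iteration_bounds(M0, K0, t, mu0, gamma):
    """Return (m, sum from (57), closed-form bound from (58))."""
    # m is the positive integer with m*K0*gamma < M0 <= (m+1)*K0*gamma.
    m = math.ceil(M0 / (K0 * gamma)) - 1
    if m < 1:
        raise ValueError("M0 must exceed K0 * gamma")
    total = (K0 / t) * sum(1.0 + 1.0 / (n * mu0 - 1.0) for n in range(1, m + 1))
    closed = (M0 / (t * gamma)
              + K0 / (t * mu0) * math.log(M0 / (K0 * gamma))
              + (K0 / t) * mu0 / (mu0 - 1.0))
    return m, total, closed

m, total, closed = iteration_bounds(M0=1.05, K0=2.0, t=0.5, mu0=1.5, gamma=0.05)
print(m, total < closed)
```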



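F. The Proof of Theorem 7

Proof: Similar to (47), by defining $\mathbf{h}'_n = \mathbf{x}_n - \mathbf{x}^*$ and $\mathbf{e}_n = \mathbf{A}(\mathbf{x}_n - \mathbf{x}^*)$, the deviation iterates by

$$\begin{aligned}
\|\mathbf{h}'_{n+1}\|_2^2 = {} & \|\mathbf{h}'_n\|_2^2 - 2\gamma\,\mathbf{h}'^{\mathrm{T}}_n\mathrm{sgn}(\mathbf{x}_n) \\
& + 2\gamma\,\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n) \\
& + \gamma^2\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x}_n)\|_2^2.
\end{aligned} \tag{59}$$

From Lemma 3 and referring to (48), one has

$$\mathbf{h}'^{\mathrm{T}}_n\mathrm{sgn}(\mathbf{x}_n) \ge t\,\|\mathbf{h}'_n\|_2. \tag{60}$$

Next the third item of (59) will be studied. By the property of symmetric matrices,

$$\begin{aligned}
\|\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n)\|_2^2
&= \mathrm{sgn}^{\mathrm{T}}(\mathbf{x}_n)\,\mathbf{A}^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{e}_n\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n) \\
&= \mathrm{sgn}^{\mathrm{T}}(\mathbf{x}_n)\,\mathbf{B}\,\mathrm{sgn}(\mathbf{x}_n) \\
&\le \max\{\lambda_{\mathbf{B}}\}\,\mathrm{sgn}^{\mathrm{T}}(\mathbf{x}_n)\,\mathrm{sgn}(\mathbf{x}_n) \le N\max\{\lambda_{\mathbf{B}}\},
\end{aligned} \tag{61}$$

where

$$\mathbf{B} = \mathbf{A}^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{e}_n\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}$$

and $\{\lambda_{\mathbf{B}}\}$ denote its eigenvalues. Notice that $\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A} \in \mathbb{R}^{1 \times N}$; therefore $\mathrm{rank}(\mathbf{B})$ is at most one, and at least $N - 1$ of the eigenvalues are zeros. Consequently, one has

$$\begin{aligned}
\max\{\lambda_{\mathbf{B}}\} = \mathrm{tr}(\mathbf{B}) &= \mathrm{tr}\left(\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\mathbf{A}^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{e}_n\right) \\
&= \mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{e}_n \\
&\le \max\{\lambda_{(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}}\}\,\mathbf{e}_n^{\mathrm{T}}\mathbf{e}_n \le 4\varepsilon^2\lambda,
\end{aligned} \tag{62}$$

where the last step can be derived by

$$\begin{aligned}
\|\mathbf{e}_n\|_2 &= \|(\mathbf{y} - \mathbf{A}\mathbf{x}_n) - (\mathbf{y} - \mathbf{A}\mathbf{x}^*)\|_2 \\
&\le \|\mathbf{y} - \mathbf{A}\mathbf{x}_n\|_2 + \|\mathbf{y} - \mathbf{A}\mathbf{x}^*\|_2 \le 2\varepsilon.
\end{aligned} \tag{63}$$

It can be easily seen that $\max\{\lambda_{(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}}\}$ is positive if $\mathbf{A}\mathbf{A}^{\mathrm{T}}$ is an invertible matrix.

Because $\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n)$ is a scalar, combining (61) and (62), one has

$$\left|\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n)\right| \le 2\varepsilon\sqrt{N\lambda}. \tag{64}$$

For $\forall \mu > 1$, if

$$\|\mathbf{x}_n - \mathbf{x}^*\|_2 \ge \gamma \cdot \frac{\mu}{2t}\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2 + \frac{2\varepsilon}{t}\sqrt{N\lambda}, \tag{65}$$

using (60), (64) and (65), we have

$$2(\mathbf{x}_n - \mathbf{x}^*)^{\mathrm{T}}\mathrm{sgn}(\mathbf{x}_n) - 2\mathbf{e}_n^{\mathrm{T}}(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}\mathbf{A}\,\mathrm{sgn}(\mathbf{x}_n) \ge \gamma\mu\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2. \tag{66}$$

Combining (59) and (66), it can be concluded that under the condition of (65),

$$\|\mathbf{x}_{n+1} - \mathbf{x}^*\|_2^2 \le \|\mathbf{x}_n - \mathbf{x}^*\|_2^2 - \gamma^2(\mu - 1)\max_{\mathbf{x} \in \mathbb{R}^N}\|\mathbf{P}\,\mathrm{sgn}(\mathbf{x})\|_2^2.$$

Then Theorem 7 is proved.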
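The rank-one argument behind (61) and (62) can be illustrated on a small random instance. In this sketch the matrix sizes, the stand-in vector `e` for $\mathbf{e}_n$, and the sign vector `s` are hypothetical, and `lam` is taken to be the largest eigenvalue of $(\mathbf{A}\mathbf{A}^{\mathrm{T}})^{-1}$, which is how $\lambda$ is read here.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 6, 15
A = rng.normal(size=(M, N))
e = rng.normal(size=M)                      # stand-in for e_n
s = np.sign(rng.normal(size=N))             # stand-in for sgn(x_n)

G_inv = np.linalg.inv(A @ A.T)
v = A.T @ G_inv @ e                         # B = v v^T is rank one
B = np.outer(v, v)

lam_B = np.linalg.eigvalsh(B).max()         # largest eigenvalue of B
trace_form = e @ G_inv @ e                  # e^T (A A^T)^{-1} e, as in (62)
lam = np.linalg.eigvalsh(G_inv).max()       # assumed meaning of lambda
scalar = e @ G_inv @ A @ s                  # the scalar bounded in (64)

print(np.isclose(lam_B, trace_form),
      abs(scalar) <= np.linalg.norm(e) * np.sqrt(N * lam))
```

With $\|\mathbf{e}_n\|_2 \le 2\varepsilon$ from (63), the second check reproduces the bound $2\varepsilon\sqrt{N\lambda}$.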